Separating wheat from chaff: Diatom taxon selection using an artificial neural network pruning algorithm
Abstract
This study addresses the question of what diatom taxa to include in a modern calibration set based on their relative contribution in a palaeolimnological calibration model. Using a pruning algorithm for Artificial Neural Networks (ANNs) which determines the functionality of individual taxa in terms of model performance, we pruned the Surface Water Acidification Project (SWAP) pH-diatom data-set until the predictive performance of the pruned set (as assessed by a jackknifing procedure) was statistically different from the initial full-set. Our results, based on the validation at each 5% data-set reduction, show that (i) 85% of the taxa can be removed without any effect on the pH model calibration performance, and (ii) that the complexity and the dimensionality reduction of the model by the removal of these non-essential or redundant taxa greatly improve the robustness of the calibration. A comparison between the commonly used "marginal" criteria for inclusion (species tolerance and Hill’s N2) and our functionality criterion shows that the importance of each taxon in an ANN palaeolimnological model calibration does not appear to depend on these marginal characteristics.

Introduction

Several types of algorithm have been proposed to develop quantitative inference models in palaeolimnology (Birks 1995): Weighted Averaging regression/calibration (WA) (ter Braak and van Dam 1989; Birks et al. 1990), Weighted Averaging Partial Least Square regression (WA-PLS) (ter Braak and Juggins 1993), Gaussian regression and maximum likelihood calibration (ter Braak and van Dam 1989; ter Braak et al. 1993; Vasko et al. 2000), and back-propagation (BP) (Rumelhart et al. 1986) of Artificial Neural Networks (ANNs) (Racca et al. 2001). All of these methods have inherent but different abilities to model the complex relations between taxon assemblages and environmental variables and all yield successful predictive models. However, they lack, to varying degrees, transparency as to the way information is extracted from the assemblage data and implemented in the predictive model.

While it is clear that the predictive ability of these models can depend on the statistical characteristics of the calibration set (distribution and range of the environmental variable, number of samples, number of taxa, etc.), the modelling approach is also important to the final success of the model. Although some methods have been shown to outperform others in certain conditions (ter Braak and Juggins 1993; ter Braak et al. 1993; ter Braak 1995; Racca et al. 2001), little is known about the inclusion or exclusion of taxa based on their contribution to the calibration model. Generally, calibration data-sets are large and sparse and the criterion for taxon inclusion is typically ad hoc (e.g., all taxa with 1% relative abundance in at least one sample, present
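For readers unfamiliar with the "marginal" criteria named in the abstract, the sketch below shows one common way to compute them per taxon: the weighted-averaging optimum, a simple abundance-weighted tolerance (without any bias correction), and Hill's N2 as the effective number of occurrences. It is an illustration under stated assumptions, not the authors' code: the function name `taxon_marginal_stats`, the array layout (samples in rows, taxa in columns) and the toy numbers are hypothetical and are not taken from the SWAP data-set.

```python
import numpy as np


def taxon_marginal_stats(abund, env):
    """Per-taxon WA optimum, tolerance and Hill's N2 (illustrative helper).

    abund: samples x taxa array of non-negative abundances (assumed layout).
    env:   one environmental value per sample, e.g. lake-water pH.
    """
    abund = np.asarray(abund, dtype=float)
    env = np.asarray(env, dtype=float)

    totals = abund.sum(axis=0)  # total abundance of each taxon
    if np.any(totals <= 0):
        raise ValueError("every taxon must occur in at least one sample")

    # Share of each taxon's total abundance found in each sample;
    # every column of `weights` sums to 1.
    weights = abund / totals

    # WA optimum: abundance-weighted mean of the environmental variable.
    optimum = weights.T @ env

    # Tolerance: abundance-weighted standard deviation around the optimum
    # (simple form, no N2-based correction applied).
    squared_dev = (env[:, None] - optimum[None, :]) ** 2
    tolerance = np.sqrt((weights * squared_dev).sum(axis=0))

    # Hill's N2: inverse of the sum of squared shares, i.e. the
    # effective number of samples in which the taxon occurs.
    hill_n2 = 1.0 / (weights ** 2).sum(axis=0)

    return optimum, tolerance, hill_n2


# Toy data: 3 samples x 2 taxa, with made-up pH values.
toy_abund = [[10, 0],
             [10, 5],
             [0, 15]]
toy_ph = [5.0, 6.0, 7.0]

opt, tol, n2 = taxon_marginal_stats(toy_abund, toy_ph)
print("optimum:", opt)
print("tolerance:", tol)
print("Hill's N2:", n2)
```

In the toy data the first taxon is spread evenly over two samples (N2 of 2), while the second is concentrated in one sample and so has a lower N2. The abstract's point is that such marginal values did not appear to predict how much a taxon mattered to the ANN calibration.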
Publication date: 2002